ENTROPY, ECONOMICS, PHYSICS *)

Jacob Marschak, University of California at Los Angeles

I. As pointed out by C. Brumat, no extremization occurs
in the physicists' (e.g. E. Schroedinger's) derivation of
entropy. With $p =_{df} (p_1, \ldots, p_m)$,

$$H(p) =_{df} -\sum_{i=1}^{m} p_i \ln p_i
       = \lim_{N \to \infty} \{[\ln P(p;N)]/N\} + \ln m,$$

where $P(p;N)$ = probability that $N p_i$ entities are in state
$i$ $(i = 1, \ldots, m)$

$$= \frac{N!}{\prod_{i=1}^{m} (N p_i)!} \cdot \prod_{i=1}^{m} \pi_i^{N p_i};$$

provided prior uniformity is assumed, i.e., $\pi_i = 1/m$, all $i$.
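The limit in section I can be checked numerically. The following is a minimal sketch, assuming the multinomial form of P(p;N) under the uniform prior; the distribution p = (0.5, 0.3, 0.2) and the function names are illustrative choices, not taken from the paper.

```python
import math

def entropy(p):
    """H(p) = -sum_i p_i ln p_i in nats, with 0 ln 0 taken as 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def log_allocation_probability(counts):
    """ln P(p;N): log-probability that counts[i] = N*p_i entities are in
    state i when every entity is in each of the m states with
    probability 1/m (the uniform prior pi_i = 1/m)."""
    n, m = sum(counts), len(counts)
    log_multinomial = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_multinomial - n * math.log(m)

def entropy_from_allocation(p, n):
    """[ln P(p;N)]/N + ln m for a finite N (N*p_i assumed to be integers)."""
    counts = [round(pi * n) for pi in p]
    return log_allocation_probability(counts) / n + math.log(len(p))

p = (0.5, 0.3, 0.2)  # illustrative distribution (assumption)
for n in (10, 1000, 10**6):
    # The finite-N value approaches H(p) from below as N grows.
    print(n, entropy_from_allocation(p, n), entropy(p))
```

The gap between the two printed columns shrinks roughly like (ln N)/N, which is the Stirling correction dropped in the limit.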


*) Prepared for the International Seminar on "Collective
Phenomena and the Applications of Physics to Other Fields
of Sciences" planned to be held in Moscow, July 1974, with
the participation of dismissed Jewish Soviet scientists.
If the Seminar cannot be held the paper will be submitted to
the North-American meeting of the Econometric Society,
San Francisco, December 1974. Acknowledgements are due to
the Alexander von Humboldt Foundation and the U.S. Office of
Naval Research.
II. By contrast, in "information theory", entropy is
derived by extremization:

$$H(p) = \lim_{b \to \infty} \Big( \min_{w} \sum_{i=1}^{m^b} p_i w_i \cdot \ln r \Big) / b$$

subject to the decodability condition

$$\sum_{i=1}^{m^b} r^{-w_i} \le 1,$$

where $w_i$ = number of digits in the code word encoding the
i-th sequence ("block") of b states ($p_i$ in the sum being the
probability of that block); and r = size of code alphabet.
H(p) thus does measure "disorder": the description of a
crystal, or of a circle, can be encoded in fewer parameters
than that of a scatter of points. But do physicists relate
entropy to disorder via efficient coding?
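The extremization in section II can be illustrated for a binary code alphabet (r = 2). The sketch below assumes that the b states in a block are independent, so that a block's probability is the product of its state probabilities, and it obtains the minimum expected code-word length from Huffman's construction, which is a standard technique and not one named in the paper. The distribution p and all function names are illustrative.

```python
import heapq
import itertools
import math

def entropy(p):
    """H(p) = -sum_i p_i ln p_i in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def min_expected_length(block_probs):
    """Minimum expected code-word length, in binary digits, over decodable
    binary codes. Huffman's construction attains it, and its expected
    length equals the sum of the weights of all merged nodes."""
    heap = list(block_probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total

def coding_cost_per_state(p, b):
    """(min_w sum_i p_i w_i * ln r) / b for r = 2 and blocks of b states.
    Assumption: states within a block are independent."""
    block_probs = [math.prod(block) for block in itertools.product(p, repeat=b)]
    return min_expected_length(block_probs) * math.log(2) / b

p = (0.6, 0.3, 0.1)  # illustrative distribution (assumption)
for b in (1, 2, 4, 6):
    # The cost per state stays within ln(2)/b above H(p).
    print(b, coding_cost_per_state(p, b), entropy(p))
```

For every block length b the cost per state lies between H(p) and H(p) + (ln 2)/b, so it converges to H(p) as b grows.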


III. Thus H increases with the length of an economically
coded message, and hence with the expected cost of storing
and transmitting information; but not with the cost of
collecting it; nor, given the cost, with the expected gain to
the information user (as seems to be implied by H. Theil?).
The latter depends, not only on p, but also on the "benefit
function" $\beta$ (of actions $a_j$ and states $z_i$). In
particular, the expected gain from perfect information about
the $z_i$ is

$$g_\beta(p) =_{df} \sum_i p_i \max_j \beta(a_j, z_i)
             - \max_j \sum_i p_i \beta(a_j, z_i) \ge 0.$$

The function $g_\beta(\cdot)$ is concave in p but not necessarily
symmetric. (The same is true of imperfect information.)
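A small numerical illustration of the expected gain from perfect information follows. The benefit table is hypothetical: two actions and two states, chosen only to show that permuting the probabilities changes the gain, while the gain stays non-negative and satisfies the midpoint test for concavity.

```python
def perfect_information_gain(p, beta):
    """g_beta(p) = sum_i p_i max_j beta[j][i] - max_j sum_i p_i beta[j][i],
    where beta[j][i] is the benefit of action a_j in state z_i."""
    informed = sum(pi * max(row[i] for row in beta) for i, pi in enumerate(p))
    uninformed = max(sum(pi * bji for pi, bji in zip(p, row)) for row in beta)
    return informed - uninformed

# Hypothetical benefit function (assumption, not from the paper).
beta = [[1.0, 0.0],   # action a_1 pays 1 in state z_1, nothing in z_2
        [0.0, 3.0]]   # action a_2 pays 3 in state z_2, nothing in z_1

g_a = perfect_information_gain((0.8, 0.2), beta)    # about 0.6
g_b = perfect_information_gain((0.2, 0.8), beta)    # about 0.2: not symmetric
g_mid = perfect_information_gain((0.5, 0.5), beta)  # 0.5 >= (g_a + g_b) / 2
print(g_a, g_b, g_mid)
```

Entropy would assign the same value to (0.8, 0.2) and (0.2, 0.8); the gain to the user does not, because the two states carry different stakes.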

IV. "Intuitively", for any "uncertainty function" U(p):

(a) U is symmetric;

(b) $U(p) \le U(1/m, \ldots, 1/m)$;

(c) the uncertainty removed by perfect information,
$U(p) - U(1, 0, \ldots, 0)$, is non-negative.

Now, these properties are shared by H with all other
symmetric quasi-concave functions (or, extending (c) to
imperfect information, with all symmetric concave functions:
De Groot). As to the additivity property

(d) H(p,q) = H(p) + H(q) for p, q independent:

it is exclusive to H but is relevant only to message length.
(Note: transportation and warehousing costs, too, are
additive and are independent of the benefits and production
costs of goods moved and stored!) To compare and order
(rather than to measure), additivity is not required.
W. Hildenbrand and H. Paschen, H. Theil, and others used
entropy to compare degrees of "concentration" (e.g., of an
industry) and were criticized by P. Hart on empirical
grounds.
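Properties (a) to (d) can be checked on a concrete alternative to H. The sketch below uses 1 - sum p_i^2 (often called the Gini-Simpson index) as one example of a symmetric concave function; that choice, the test distributions, and the function names are assumptions made for illustration and are not the paper's.

```python
import math

def entropy(p):
    """H(p) = -sum_i p_i ln p_i in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def gini_simpson(p):
    """Another symmetric concave function: 1 - sum_i p_i^2."""
    return 1.0 - sum(pi * pi for pi in p)

def joint(p, q):
    """Joint distribution of two independent variables."""
    return [pi * qj for pi in p for qj in q]

def additivity_gap(u, p, q):
    """U(p,q) - U(p) - U(q); zero when property (d) holds."""
    return u(joint(p, q)) - u(p) - u(q)

p, q = (0.7, 0.2, 0.1), (0.6, 0.4)
uniform, certain = (1 / 3, 1 / 3, 1 / 3), (1.0, 0.0, 0.0)

for u in (entropy, gini_simpson):
    symmetric = abs(u(p) - u(p[::-1])) < 1e-12       # (a)
    below_uniform = u(p) <= u(uniform)               # (b)
    nonneg_reduction = u(p) - u(certain) >= 0        # (c)
    print(u.__name__, symmetric, below_uniform, nonneg_reduction,
          additivity_gap(u, p, q))                   # (d)
```

Both functions pass (a) to (c) on this example, but only the entropy has a zero additivity gap.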


V. Now define

$$X =_{df} \sum_{i=1}^{m} p_i x_i = \text{fixed average income};$$

and, with $x =_{df} (x_1, \ldots, x_m)$ fixed,

$$H^*(X,x) =_{df} H(p(X,x)) =_{df} \max_p H(p)
  \quad \text{subject to} \quad \sum_i p_i x_i = X.$$

Then (with a Lagrange multiplier $\lambda$ depending on X, x)

$$p_i = e^{-\lambda(X,x)\, x_i} / \text{const}.$$

Such an income distribution (F.P. Cantelli) is hardly
realistic (prior uniformity was assumed in I. above!). But,
if X is interpreted as average energy and $\lambda$ as inversely
proportional to absolute temperature T, the defining property
of the original, non-stochastic, concept of physical entropy
obtains:

$$dH^*(X,x) = dQ/T, \quad \text{where}$$

$$dQ =_{df} dX - \sum_i p_i \, dx_i =_{df} \text{heat supply}.$$

Detailed analogies with both information theory and a theory
of income distribution have been proposed by H. Reiss.
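The relation between the entropy increment and the heat supply can be verified by finite differences. This is a sketch under stated assumptions: the sign convention p_i proportional to exp(-lambda x_i), units in which lambda = 1/T, bisection over a fixed bracket for lambda, and illustrative values of x, X and of the small changes dx, dX.

```python
import math

def entropy(p):
    """H(p) = -sum_i p_i ln p_i in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def gibbs(x, lam):
    """p_i = exp(-lam * x_i) / const."""
    weights = [math.exp(-lam * xi) for xi in x]
    const = sum(weights)
    return [w / const for w in weights]

def max_entropy(x, X):
    """Return (H*, lam, p) maximizing H(p) subject to sum_i p_i x_i = X.
    The mean of the Gibbs distribution decreases in lam, so lam is found
    by bisection; X must lie strictly between min(x) and max(x)."""
    lo, hi = -50.0, 50.0  # bracket for lam (assumption: moderate x values)
    for _ in range(200):
        lam = (lo + hi) / 2
        mean = sum(pi * xi for pi, xi in zip(gibbs(x, lam), x))
        if mean > X:
            lo = lam   # mean too high: a larger lam is needed
        else:
            hi = lam
    p = gibbs(x, lam)
    return entropy(p), lam, p

x, X = (1.0, 2.0, 4.0), 2.0              # illustrative values (assumption)
dx, dX = (1e-5, -2e-5, 3e-5), 1e-5       # small changes in x and X

h0, lam, p = max_entropy(x, X)
h1, _, _ = max_entropy([xi + dxi for xi, dxi in zip(x, dx)], X + dX)
dQ = dX - sum(pi * dxi for pi, dxi in zip(p, dx))   # heat supply
print(h1 - h0, lam * dQ)   # the two agree to first order
```

With x = (1, 2, 4) and X = 2 the multiplier comes out as (ln 2)/3, and the entropy change matches lambda times dQ up to second-order terms.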
VI. The constrained maximization of H, as above, was
proposed by E. Jaynes and M. Tribus for the case of limited
knowledge of prior probabilities and was recently applied
to stock market analysis (J. Cozzolino and M. Zahner; D.
Griffith and J. Snell). This approach should be compared
with the "Laplace" assumption of uniformity over the
probability space, whether constrained with respect to its
dimensionality m of $p = (p_1, \ldots, p_m)$ or to some other
property. Examples:


Constraint:                      Estimate of p:
                                 Laplace            Jaynes

m=2                              (1/2, 1/2)         (1/2, 1/2)
m=2; p_1 >= 2/3                  (5/6, 1/6)         (2/3, 1/3)
m=2; p_1 <= 2/3                  (1/3, 2/3)         (1/2, 1/2)
m=3                              (1/3, 1/3, 1/3)    (1/3, 1/3, 1/3)
m=3; p_2 = p_3                   (1/2, 1/4, 1/4)    (1/3, 1/3, 1/3)
m=3; p_1 + 2p_2 + 4p_3 = 2       (1/3, 1/2, 1/6)    (.43, .35, .22) approx.
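The constrained rows of the table can be reproduced with a short computation. In each of those rows the feasible set is a line segment in probability space, so the sketch below takes the "Laplace" estimate to be the midpoint of the segment (the mean of a uniform distribution along it, which is an assumption about how uniformity is parametrized) and finds the "Jaynes" estimate by ternary search for the entropy maximum. The segment endpoints are derived from the constraints; the function names are illustrative.

```python
import math

def entropy(p):
    """H(p) = -sum_i p_i ln p_i in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def point(a, b, s):
    """Point (1 - s) * a + s * b on the segment from a to b."""
    return [(1 - s) * ai + s * bi for ai, bi in zip(a, b)]

def laplace_estimate(a, b):
    """Mean of a uniform distribution over the segment: its midpoint."""
    return point(a, b, 0.5)

def jaynes_estimate(a, b, iterations=200):
    """Entropy-maximizing point of the segment, by ternary search
    (valid because H is concave along the segment)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        s1, s2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if entropy(point(a, b, s1)) < entropy(point(a, b, s2)):
            lo = s1
        else:
            hi = s2
    return point(a, b, (lo + hi) / 2)

# Endpoints of the feasible segment for each constrained row.
segments = {
    "m=2; p1 >= 2/3": ((2 / 3, 1 / 3), (1.0, 0.0)),
    "m=2; p1 <= 2/3": ((0.0, 1.0), (2 / 3, 1 / 3)),
    "m=3; p2 = p3": ((1.0, 0.0, 0.0), (0.0, 0.5, 0.5)),
    "m=3; p1 + 2 p2 + 4 p3 = 2": ((0.0, 1.0, 0.0), (2 / 3, 0.0, 1 / 3)),
}
for name, (a, b) in segments.items():
    print(name, laplace_estimate(a, b), jaynes_estimate(a, b))
```

For the last row the maximum lies at about (0.436, 0.346, 0.218), which agrees with the approximate figures in the table to within 0.01. The two unconstrained rows are not segments and are left out of the sketch.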

The point in the (constrained) probability space that is
obtained by the "Laplace" approach can be regarded as the
mean of the distribution of probability distributions.
Superficially, the "Jaynes" approach might be said to
determine its mode, and thus have the advantage of invariance
under transformations of the random variable. However, as
stated at the beginning of this paper, entropy is related
to the probability of a given distribution under the
assumption of prior uniformity over the set of values (called
"states") of the random variable: $\pi_i = 1/m$. This removes
the invariance of the mode.

VII. In another context of statistical inference,
maximizing a generalized entropy amount ("discrimination
information") was recommended by S. Kullback and has been
applied to econometric estimation by G. Tintner and M.V.
Rama Sastry.


VIII. Application of the H-formula to interregional
economics was also made (A. Charnes et al.). It was criticized
by S. Hansen and by M. Beckmann.

IX. H measures the expected "optimal incentive to
forecaster" under conditions (I.J. Good) which, however, were
shown to be very special ones (J. McCarthy; A.D. Hendrickson
and R.J. Buehler).

X. Finally, not as a mathematical analogy but as a
physico-sociological fact, N. Georgescu-Roegen has applied to
the environment of industrial societies (as Schroedinger did
to that of an organism) the above physical relation between
heat supply and entropy increment, and the implied law of
increasing entropy. He insists on its nonstochastic, hence more
impatiently pessimistic, version.
ABSTRACT

The entropy formula of information theory measures, in
the limit, the minimum expected number of symbols needed to
store or transmit a long sequence of decodable messages.
It is not related to other quantities, also relevant to the
cost, or relevant to the benefit, of information. And if it is
at all useful to compare degrees of "uncertainty", any concave
symmetric function on probability space has the "intuitively"
desirable properties.

The number of required symbols, e.g., of specified
"parameters", does measure "disorder", said to characterize
physical entropy. However, in contrast to the minimum
expected message length, the entropy of statistical physics
is not derived by extremization; it is, in the limit,
related to the probability of a given allocation of states
among a large number of entities.

The most probable allocation (and thus maximum entropy),
which implies, in physics, the basic relation between
temperature and the changes of heat and entropy, has been
sometimes interpreted as the most probable income
distribution. Maximization of entropy has also been proposed
for statistical inference, without clear justification.

All this must be distinguished from the study of
physical entropy in organized human agglomerations.